PE Gonad BS Data





SE example
./bsmap -a /Volumes/Bay4\ scratch/temp/filtered_Unlabeled_NoIndex_L003_R1wID_trimmed.fastq -d  /Volumes/NGS\ Drive/Oyster\ Genome/oyster.v9_M.fa -o /Volumes/AquaculX/armina/BSMAP_output_trimmed_v9_M.sam -p 2


hummingbird
./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam -p 8

note: bsmap can read gzipped fastq (.fastq.gz) directly


took about 2 hours

Total number of aligned reads:
pairs:       85147571 (50%)
single a:    16704916 (9.7%)
single b:    15703005 (9.2%)

Done.
Finished at Thu Dec 20 10:31:38 2012




running methratio on greenbird


python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -u -p -q -z -o /Volumes/web/whale/ce_bs/OUT_methratio_gonadPE_v9_90_A.txt -s /Volumes/Bay3/Software/samtools  /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam

total 136437582 valid mappings, 119535479 covered cytosines, average coverage: 11.15 fold.

try on hummingbird

python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -u -p -q -z -o /Volumes/web/whale/ce_bs/OUT_methratio_gonadPE_v9_90_B.txt -s /Users/Shared/Apps/bsmap-2.73/samtools /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam

total 136437582 valid mappings, 119535479 covered cytosines, average coverage: 11.15 fold.


Filter for + only
~63,000,000 lines
format: tabular, database: ?
Info: Filtering with c3=='+',
kept 50.01% of 119535480 valid lines (119535480 total lines).


trim col 4, then select CG

example 





#C+T >= 10

~3.6 million lines
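
The three Galaxy filters above (+ strand only, CG context, C+T >= 10) can be sketched outside Galaxy as a single pass. This is only a sketch under assumptions: column 3 is the strand (as in the c3=='+' filter), column 4 is a 5-base context with the C in the middle (so characters 3-4 are the dinucleotide kept by the trim), and column 6 is the C+T count. The column index for C+T and the sample rows are made up for illustration and should be checked against the real methratio output.

```python
import csv

# Made-up rows in the ASSUMED methratio layout (tab-separated):
# chr, pos, strand, context, ratio, C+T count
SAMPLE = [
    "chr\tpos\tstrand\tcontext\tratio\teff_CT_count",
    "scaffold1\t101\t+\tAACGT\t0.800\t15.0",  # kept
    "scaffold1\t102\t-\tTTCGA\t0.500\t20.0",  # dropped: - strand
    "scaffold1\t150\t+\tAACAT\t0.000\t30.0",  # dropped: not CG
    "scaffold1\t160\t+\tGGCGA\t0.000\t4.0",   # dropped: C+T < 10
]


def filter_cg_10x(lines, min_ct=10, ct_col=5):
    """Yield rows on the + strand, in CG context, with C+T >= min_ct.

    ct_col is a 0-based column index and is an assumption, not taken
    from the notebook.
    """
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines, short lines, and the header (non-numeric pos).
        if len(row) <= ct_col or not row[1].isdigit():
            continue
        strand, context = row[2], row[3]
        is_cg = context[2:4] == "CG"  # "trim col 4", then select CG
        if strand == "+" and is_cg and float(row[ct_col]) >= min_ct:
            yield row


kept = list(filter_cg_10x(SAMPLE))
for row in kept:
    print("\t".join(row))
```

For the real file, pass an open file handle instead of SAMPLE; the generator streams line by line, so the ~119 million line input never has to fit in memory.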



http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x.txt


WORKFLOW
http://eagle.fish.washington.edu/cnidarian/Galaxy-Workflow-methratio_processing.ga


complete wf
http://eagle.fish.washington.edu/cnidarian/Galaxy-Workflow-methratio_processing_BED.ga


methylated
~940,000 lines
format: tabular, database: ?
Info: Filtering with c5>0,
kept 26.97% of 3571019 valid lines (3571019 total lines).

http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_METHbed.txt

http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_NOmethbed.txt
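
A rough sketch of the methylated / not methylated split (the c5>0 filter) followed by conversion to BED-style intervals. Assumptions: column 5 is the methylation ratio, positions are 1-based, and each CG is written as a 1-bp BED interval (0-based start, exclusive end). The exact columns in the METHbed and NOmethbed files are not recorded here, so the layout below is hypothetical, as are the sample rows.

```python
def split_by_methylation(rows, ratio_col=4):
    """Split rows into (methylated, unmethylated) using ratio > 0."""
    methylated, unmethylated = [], []
    for row in rows:
        target = methylated if float(row[ratio_col]) > 0 else unmethylated
        target.append(row)
    return methylated, unmethylated


def to_bed(row, ratio_col=4):
    """Turn one row into (chrom, start, end, ratio).

    Assumes the position in column 2 is 1-based, so the BED start is
    pos - 1 and the end is pos.
    """
    chrom, pos = row[0], int(row[1])
    return (chrom, pos - 1, pos, row[ratio_col])


# Made-up rows: chr, pos, strand, context, ratio, C+T count
rows = [
    ["scaffold1", "101", "+", "CG", "0.800", "15.0"],
    ["scaffold1", "250", "+", "CG", "0.000", "12.0"],
    ["scaffold2", "40", "+", "CG", "0.125", "16.0"],
]

meth, nometh = split_by_methylation(rows)
meth_bed = [to_bed(r) for r in meth]
nometh_bed = [to_bed(r) for r in nometh]
for rec in meth_bed:
    print("\t".join(str(x) for x in rec))
```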




Methylated CGs that overlap with CDS
http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_METHbed_BED%20(OF).txt (37k)


not within CDS
http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_METHbed_BED%20(notCDS).txt (119k)
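
The notes do not record which tool produced the CDS / not-CDS split. As an illustration only, the overlap test can be sketched in plain Python: merge the CDS intervals per scaffold, then binary-search each CG position. The scaffold names and coordinates below are made up, and intervals are assumed to be BED-style (0-based start, exclusive end).

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(cds_intervals):
    """Map chrom -> (starts, ends) of sorted, merged half-open intervals."""
    by_chrom = defaultdict(list)
    for chrom, start, end in cds_intervals:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, spans in by_chrom.items():
        merged = []
        for start, end in sorted(spans):
            if merged and start <= merged[-1][1]:
                # Overlapping or touching: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_cds(index, chrom, pos0):
    """True if the 0-based position pos0 falls inside any CDS interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos0) - 1  # last interval starting <= pos0
    return i >= 0 and pos0 < ends[i]


# Hypothetical CDS features and CG positions
cds = [("scaffold1", 100, 200), ("scaffold1", 150, 300), ("scaffold2", 10, 20)]
index = build_index(cds)
cg_sites = [("scaffold1", 100), ("scaffold1", 300), ("scaffold3", 5)]
in_cds = [site for site in cg_sites if overlaps_cds(index, *site)]
not_cds = [site for site in cg_sites if not overlaps_cds(index, *site)]
print(len(in_cds), len(not_cds))
```

Merging first keeps the lookup correct when CDS features overlap (for example, alternative transcripts), since a plain binary search on unmerged starts could miss a longer earlier interval.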





another iteration
./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90_B.sam -p 8 -x 300 -z 64 -q 2 -w 100 -v 5



aborted: was writing a lot of data to the terminal





./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90_C.sam -p 8 -x 300 -q 2 -w 100 -v 5

BSMAP v2.73
Start at:  Thu Dec 20 11:56:09 2012

Input reference file: /Volumes/web/whale/ce_bs/oyster.v9_90.fa      (format: FASTA)
Load in 1670 db seqs, total size 502809129 bp. 12 secs passed
total_kmers: 43046721
Create seed table. 32 secs passed
max number of mismatches: 5     max gap size: 0
kmer cut-off ratio:5e-07
max multi-hits: 100     max Ns: 5     seed size: 16     index interval: 4
quality cutoff: 2     base quality char: '!'
min fragment size:28     max fragemt size:300
start from read #1     end at read #4294967295
additional alignment: T in reads => C in reference
mapping strand (read_1): ++,-+
mapping strand (read_2): +-,--
Pair-end alignment(8 threads)
Input read file #1: /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz      (format: gzipped FASTQ)
Input read file #2: /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz      (format: gzipped FASTQ)
Output file: /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90_C.sam      (format: SAM)

Total number of aligned reads:
pairs:       87768222 (51%)
single a:    18091279 (11%)
single b:    16521277 (9.6%)
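
For keeping track of the parameter sets, a small helper that builds the bsmap argument list with named settings may help. The flag meanings in the comments are read off the log above (-x 300 with max fragment size 300, -q 2 with quality cutoff 2, -w 100 with max multi-hits 100, -v 5 with max number of mismatches 5, -p 8 with 8 threads). The function name and keyword names are made up for this sketch, and it only prints the command; it does not run bsmap.

```python
import shlex


def build_bsmap_command(read1, read2, reference, output, threads=8,
                        max_fragment=300, quality_cutoff=2,
                        max_multihits=100, max_mismatches=5):
    """Return the bsmap argument list for a paired-end run (not executed)."""
    return [
        "./bsmap",
        "-a", read1,
        "-b", read2,
        "-d", reference,
        "-o", output,
        "-p", str(threads),         # log: Pair-end alignment(8 threads)
        "-x", str(max_fragment),    # log: max fragment size 300
        "-q", str(quality_cutoff),  # log: quality cutoff 2
        "-w", str(max_multihits),   # log: max multi-hits 100
        "-v", str(max_mismatches),  # log: max number of mismatches 5
    ]


base = "/Volumes/web/whale/ce_bs/"
cmd = build_bsmap_command(
    base + "filtered_174gm_A_NoIndex_L006_R1.fastq.gz",
    base + "filtered_174gm_A_NoIndex_L006_R2.fastq.gz",
    base + "oyster.v9_90.fa",
    base + "BSMAP_output_PE_v9_90_C.sam",
)
command_line = shlex.join(cmd)  # shell-safe string, quotes paths if needed
print(command_line)
```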






./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_M.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_M_A.sam -p 6

Total number of aligned reads:
pairs:       32019169 (19%)
single a:    9632050 (5.6%)
single b:    9262145 (5.4%)
Done.





./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_A.sam -p 8 -x 300 -q 2 -w 100 -v 5


complete





./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90_D.sam -p 12 -x 300 -q 2 -w 100 -v 8

Total number of aligned reads:
pairs:       93132679 (54%)
single a:    17844205 (10%)
single b:    16710672 (9.7%)
Done.
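
A quick tally of aligned pairs across the three oyster.v9_90.fa runs, using the counts reported above. The run labels are shorthand invented for this sketch. Note that the runs differ in more than -v (also -x, -q, -w and -p), so the gain cannot be attributed to the mismatch setting alone.

```python
# Aligned pair counts copied from the bsmap summaries for oyster.v9_90.fa
aligned_pairs = {
    "defaults, -p 8": 85147571,
    "-x 300 -q 2 -w 100 -v 5": 87768222,
    "-x 300 -q 2 -w 100 -v 8": 93132679,
}

baseline = aligned_pairs["defaults, -p 8"]
gains = {}
for label, pairs in aligned_pairs.items():
    # Percent change in aligned pairs relative to the default run
    gains[label] = round(100.0 * (pairs - baseline) / baseline, 2)
    print(f"{label:28s} {pairs:>10d}  {gains[label]:+.2f}%")
```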


python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -u -p -q -z -o /Volumes/web/whale/ce_bs/OUT_methratio_gonadPE_v9_90_C.txt -s /Users/Shared/Apps/bsmap-2.73/samtools /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90_D.sam